-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
BLD: use unsigned instead of signed for lengths, avoid build warnings #26759
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #26759 +/- ##
==========================================
- Coverage 91.71% 91.7% -0.01%
==========================================
Files 178 178
Lines 50740 50740
==========================================
- Hits 46538 46533 -5
- Misses 4202 4207 +5
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #26759 +/- ##
==========================================
- Coverage 91.97% 91.97% -0.01%
==========================================
Files 180 180
Lines 50756 50756
==========================================
- Hits 46685 46682 -3
- Misses 4071 4074 +3
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On board with the concept just question around casts (applicable in a few places)
pandas/_libs/src/parser/tokenizer.c
Outdated
@@ -471,7 +472,7 @@ static int end_line(parser_t *self) { | |||
return 0; | |||
} | |||
|
|||
if (!(self->lines <= (int64_t) self->header_end + 1) && | |||
if (!(self->lines <= (uint64_t) self->header_end + 1) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this cast necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I think it isn't.
pandas/_libs/src/parser/tokenizer.c
Outdated
@@ -248,7 +248,8 @@ void parser_del(parser_t *self) { | |||
} | |||
|
|||
static int make_stream_space(parser_t *self, size_t nbytes) { | |||
int64_t i, cap, length; | |||
uint64_t i, cap; | |||
int64_t length; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does leaving this as is not introduce any new build warnings? Looks like assignments and comparisons are done with unsigned types within this function
pandas/_libs/src/parser/tokenizer.c
Outdated
@@ -651,7 +652,7 @@ static int parser_buffer_bytes(parser_t *self, size_t nbytes) { | |||
stream = self->stream + self->stream_len; \ | |||
slen = self->stream_len; \ | |||
self->state = STATE; \ | |||
if (line_limit > 0 && self->lines == start_lines + (int64_t)line_limit) { \ | |||
if (line_limit > 0 && self->lines == start_lines + (uint64_t)line_limit) { \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to even have these casts? Fraught with peril to begin with using macros I think this just adds nuances / developer burden
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to remove these casts.
Should we consider getting rid of the macros? That's outside my region of competence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea might be worth doing so in a separate PR. Here's the cpp take on macros:
https://isocpp.org/wiki/faq/inline-functions#inline-vs-macros
I could see this one being particular troublesome with our codebase:
https://isocpp.org/wiki/faq/misc-technical-issues#macros-with-if
I think the main downside to using inline vs macros would be support for pre-C99 compilers, but I don't think that's a problem given age and the fact that we have inline functions elsewhere in the code base
OK by me. Does this offer any performance improvements? Wondering if Cython optimizes for not doing a wraparound check like in this SO question: |
pandas/_libs/src/parser/tokenizer.c
Outdated
@@ -248,7 +248,8 @@ void parser_del(parser_t *self) { | |||
} | |||
|
|||
static int make_stream_space(parser_t *self, size_t nbytes) { | |||
int64_t i, cap, length; | |||
uint64_t i, cap; | |||
int64_t length; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is length signed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because there weren't any warnings telling us not to. Probably makes more sense unsigned liked the others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you change this, otherwise lgtm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing this requires changing a few other places to match. I think its benign, but not obvious. Go for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep prob should
small comment, merge master and ping on green. |
Ping |
Thanks @jbrockmendel ! |
I'm much more confident about the other two BLD PRs (#26757, #26758) than this, largely because: